49,932 results • Page 2 of 999
Hi, I am wondering if there is a Biopython or other tool to help me subset a multi-FASTA file? Thank you
updated 7.7 years ago • ksw
Hello everyone, I am new to 'real' bioinformatics and have been trying to call blastn with the biopython NcbiblastnCommandline() function on a local fasta database. It does not return an error, but it fails to output an xml...file. My blast programs work and are in the path. I can blast against the database from the terminal and I am able to access the...web based NCBI blast using biopython. My os…
updated 9.8 years ago • chris.richard.rivera
Hello, I have one fasta file with four different sequences in it. The command I am instructed to use is pairwise2 in biopython. How can I set up...a code that performs pairwise global sequence alignment between each pair of sequences in the fasta file? So I would need it to do 1:1, 1:2, 1:3 ..., and again for 2:2, 2:3, 2:4, and so on. I am also using the BioSeq parsing method to begin my code
updated 2.5 years ago • Makayla
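For the "every pair" part of the question above, a minimal sketch of how the pairs could be enumerated. It assumes the four records have already been read into a list; the names in `names` are placeholders, not taken from the post, and the alignment call itself is left as a comment because it depends on the installed Biopython version.

```python
from itertools import combinations_with_replacement

# Placeholder record names; in practice these would come from the parsed FASTA file.
names = ["seq1", "seq2", "seq3", "seq4"]

# combinations_with_replacement yields 1:1, 1:2, 1:3, 1:4, 2:2, 2:3, ... 4:4,
# i.e. every unordered pair including each sequence paired with itself.
pairs = list(combinations_with_replacement(names, 2))

for first, second in pairs:
    # The global alignment of the two corresponding sequences would be run here.
    print(first, second)
```

If self-comparisons (1:1, 2:2, ...) are not wanted, `itertools.combinations` gives the six distinct pairs instead.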
Hi: I want to extract one section of a chromosome into a FASTA file. I have two versions, but neither of them works correctly. At the end I want to have a normal FASTA file like this: >chromosome_1_results...ch1.fasta','r') fw=open("c:\\data\\ch1results.fasta",'w') s=0 for record in SeqIO.parse(inFile,'fasta'): fw.write (str(record.seq)[1:((23522552+23660224)/2)+1]) fw.close() …
updated 12.4 years ago • Ma
Hi, everyone! Apologies for the simplistic question, but I'm just getting started with Biopython, and I'm having a lot of trouble using qblast() to blast a text file of DNA sequences in FASTA format. The relevant bit...ValueError: Error message from NCBI: Message ID#24 Error: Failed to read the Blast query: Protein FASTA provided for nucleotide sequence" The sequence I'm providing is certainly …
updated 5.4 years ago • biotim
I wanted to check the number of contigs present in either a FASTA or GBK file. I am aware of tools such as CheckM that allow for this; however, is there a direct way to check...number of contigs in a sequence directly with python or biopython
updated 4.3 years ago • biohacker_tobe
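For the FASTA case in the question above, a minimal sketch in plain Python, assuming each contig is one FASTA record so that counting header lines is enough. The function name `count_fasta_records` and the demo data are made up for illustration. A GBK file would need a different check, since GenBank records start with a `LOCUS` line and not with `>`.

```python
import io


def count_fasta_records(handle):
    """Count records in an open FASTA handle by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


# Made-up three-contig example; a real file would be opened with open(path).
demo = io.StringIO(">contig_1\nACGT\n>contig_2\nGG\nCC\n>contig_3\nTT\n")
n_contigs = count_fasta_records(demo)
print(n_contigs)
```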
Hi Biostar, I want to use Primer3 via BioPython to automate primer design for a series of sequences in a fasta file. The best examples that I can find on the web (for...the generic_run function from Bio.Application and would appreciate any suggestions on how to pass fasta sequences to eprimer3. thanks, zach cp
updated 12.2 years ago • Zach Powers
files for the old Sanger and 454 FASTA files with a phred score of 40 -- I think that would be an `I` under PHRED-33. (I know there are assemblers...out there that take fasta file inputs along with fastq -- but my in-house makefile and pipeline do not and I really would like to incorporate these...in our current workflow without a lot of changes). There are lots of resources for converting fr…
updated 2.6 years ago • Josh Herr
Python. As an example I want to perform regular expression matching in sequences extracted from a FASTA file. The FASTA files being parsed with Biopython's SeqIO module. In the following code `re.findall` fails to find `iupac...with a string, e.g. 'TTAATT', a match is found. Error = `TypeError: expected string or buffer`. # biopython from Bio import SeqIO # regex library import r…
updated 3.1 years ago • Ian
Trying to run a biopython script in Windows cmd to make a bed file from a draft genome downloaded from NCBI. I get the following error. The headers...appear to be fine. I have used the biopython script many times with success previously. Can someone see what the error is please? C:\Python34>python.exe test.fasta...make_bed_from_fasta.py >test.bed File "test.fasta", line 1 …
updated 7.8 years ago • peri.tobias
I'm trying to use the Biopython wrapper for blastp to download matching protein sequences for some sequences that I have stored on my computer...I would like these matching sequences in FASTA format, similar to how on the web server one can select all sequences producing significant alignment and download...FASTA (aligned sequences)". This was my attempt: from Bio.Blast import NCBIWWW fr…
updated 6.4 years ago • traviata
I am using this biopython script from this [post][1], first answer written by Eric. The post was very old so I am adding a new post for it. The script...takes ids from a .txt file and extracts their corresponding sequences from another fasta file. But I have a problem here: the ids I am extracting lie...I tried changing it myself after reading the comments but I think I am doing it wrong. my .txt i…
updated 7.0 years ago • ashish
I am very new to programming in python. I have protein fasta files of species of plants. I would like to filter them based on the number of amino acids each sequence contains. Criteria...sequences >20 amino acids. I am able to get the sequences longer than 20 amino acids with the resources in the biopython cookbook. However, when I try to write them to the file, it gives me an error. I am unable to reprodu…
updated 4.0 years ago • adeel.maliks20
Hi. I am a new user of Biopython. I need to extract some sequences from a fasta file based on their coordinates. The file is composed of 10 chromosomes
updated 20 months ago • imaditi1987
Hi, I'm using biopython to BLAST over the internet. However, it only saves 30 results (there are more than 30 results that are under the e-value...how to make that number higher. So my question is, how can you show more results from BLAST using biopython? I'm using NCBIWWW.qblast from Bio.Blast. from Bio.Blast import NCBIWWW File = "MIF" fasta_string = open(File+".fasta
updated 13.7 years ago • Niek De Klein
output from my app: 1 sequences found {'G': 1, 'I': 2, 'M': 2, 'L': 3, 'P': 1, 'R': 4}): def FASTA(filename): try: f = file(filename) except IOError: print "The file, %s, does not exist" % filename return order = [] sequences = {} counts...aa] = 1 print "%d sequences found" % len(order) print counts return order, sequences x, y = FASTA("file.…
updated 11.3 years ago • 8mazix
Hi, I'm struggling with a sequence format problem (FASTA and PIR). Basically, I'm using *MODELLER* and its functionality in my biopython scripts. Biopython deals with *FASTA* format...two sequences in FASTA format and then do aln.append(file = 'file.fasta', align_codes='all', alignment_format='FASTA') then after that I did: ``` aln.write...file='5fd1_1fdx_output.fasta', alignme…
updated 2.0 years ago • Moses
Hi, I have been wondering at the correct approach in Python, maybe using Biopython, of parsing a fasta file without having to place it in memory (eg: NOT having to read it to a list, dictionary or fasta...fasta_generator(input_file) # The function I miss with open(output_file) as out_file: for fasta in fasta_sequences: name, sequence = fasta new_sequence = some_function(seque…
updated 5 months ago • Eric Normandeau
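For the streaming question above, a minimal sketch of a generator in plain Python that yields one record at a time, so the whole file never has to sit in memory. The name `fasta_records` and the demo input are made up for illustration, and the sketch assumes a well-formed FASTA file. Biopython's `SeqIO.parse` is likewise an iterator over records, so it can be used the same way in a `for` loop.

```python
import io


def fasta_records(handle):
    """Yield (name, sequence) tuples one at a time from an open FASTA handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, "".join(chunks)


# Made-up two-record example; a real file would be opened with open(path).
example = io.StringIO(">seq1 demo\nACGT\nTTAA\n>seq2\nGGCC\n")
records = list(fasta_records(example))
print(records)
```

Only the record currently being assembled is held in memory; calling `list()` here is just for the demonstration.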
I would like to know if it is possible to download the FASTA sequence of a PDB file using biopython
updated 4.7 years ago • henriquezvera.95
I used this piece of biopython code written by Eric Normandeau to parse several sequences from a Fasta file. The code worked nicely. I would like...Could anyone kindly show me, for example, how to modify this piece of biopython code written by Eric Normandeau to suit my purpose? Thank you very much and have a nice day
updated 9.8 years ago • KJ Lim
Hi! Has anyone used PRANK in a Biopython script? I cannot understand the parameters used. I tried this one, but no file with alignments was generated. >>...o="aligned", # prefix only! ... f=8, # FASTA output ... notree=True, noxml=True
updated 4.8 years ago • catarina.tserrano
specifically: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/ I have parsed this into files of individual chromosomes. The header for chromosome 1 looks like so: >1 dna:chromosome chromosome:GRCh37:1:1:249250621...249250621 seq_region. strand = 1 ``` My question is, is there anything I can do in Biopython to read these values in? I am just identifying the file I am re…
updated 22 months ago • bfeeny
Hi all; sorry for my lack of knowledge. I'm new to biopython and looking through the manual, there's so much information I don't know where to begin. Basically, my FASTA files look
updated 7.1 years ago • jon.sy.tarn
I have a fasta file containing papillomavirus sequences (entire genomes, partial CDS, ....) and I'm using biopython to retrieve entire...genomes (around 7kb) from this file, so here's my code: ``` rec_dict = SeqIO.index("hpv_id_name_all.fasta","fasta") for k in rec_dict.keys(): c=c+1 if len(rec_dict[k].seq...>7000: handle=open(rec_dict[k].description+"_"+str(len(rec_dict[k]…
updated 21 months ago • andynkili
of my program is as follows : I use clustalOmega to create a multiple sequence alignment saved in a fasta file. Then I would like to create a sequence logo with WebLogo using Biopython. The thing is that my sequences look like...I try to launch WebLogo, I get an Error : Exception : Unknown Alphabet. I tried to run the same fasta file on the WebLogo server without using Biopython, everything wen…
updated 7.8 years ago • makuba
Hello! I would like to get to know your experiences about the fastest way to parse over a fasta file and count the occurrence of a substring in each Seq object. Currently I am using the Biopython SeqIO.parse method
updated 2.3 years ago • mr.two
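On the substring-counting question above, one detail worth checking before optimizing for speed: Python's `str.count` only counts non-overlapping matches. A minimal sketch, assuming the sequence has already been converted to a plain string; `count_overlapping` is a made-up helper name.

```python
def count_overlapping(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlapping matches."""
    if not motif:
        raise ValueError("motif must not be empty")
    count, start = 0, 0
    while True:
        idx = sequence.find(motif, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1  # step one base forward so overlapping hits are seen


non_overlapping = "AAAA".count("AA")
overlapping = count_overlapping("AAAA", "AA")
print(non_overlapping, overlapping)
```

Which of the two counts is wanted depends on the analysis; for motifs that cannot overlap themselves the results are identical and `str.count` is the simpler choice.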
from a dataset. Then, I want to use the GenBank assembly accession GCA_001600695.1 to parse out the fasta file for his gene. Below is my code, but it did not work. Does anyone have a suggestion about what command in biopython I can use? from...Bio import Entrez handle = Entrez.efetch(db="nucleotide", id="GCA_000021505.1", rettype="fasta", retmode="text") print(handle.read
updated 4.8 years ago • ricordo.yan
I have a fasta file formatted as follows: >UPF0471 protein C1orf63 homolog some sequence >WD repeat-containing protein 43 some...>transmembrane protein 41A some sequence When I print out record.id or make dictionaries, biopython cannot handle the spaces in the sequence names. What should I do to let biopython recognize the name as a whole rather
updated 9.0 years ago • grayapply2009
and have run into an issue I cannot seem to find a solution for online. I am trying to use the Biopython wrapper to access MUSCLE in order to align ~100 sequences of 5-10kb. I believe I have successfully installed MUSCLE...Common options (for a complete list please see the User Guide): -in Input file in FASTA format (default stdin) -out Output alignment in FASTA format …
updated 7.3 years ago • jtbioinfo
Hi all, I'm trying to work out a quick script to extract a set of sequence fasta files from a multifasta and write them all to a new, single fasta file. To elaborate, I've got a proteome, and I want to extract…
updated 2.6 years ago • lachiemck
I am new to python and I need to find the ORFs of some fasta sequences and I am using SeqIO in biopython. I followed the steps in the tutorial (considering table = 11 and min_pro_len...100) to get the ORFs in the fasta file given and then compared the results with the results for the same sequence using the Open Reading Frame Finder
updated 3.2 years ago • Harper
Hello, I am trying to follow the [Biopython tutorial and cookbook][1] and store sequences in memory for processing. I downloaded the fasta files recommended in...the [Biopython workshop][2]. I ran the following code and it worked. It printed all of the sequence names from the FASTA: ```py handle = open...NC_000913.faa", "rU") for record in SeqIO.parse(handle, "fasta"): print record.id handle…
updated 16 months ago • ckan91
I'm having a beginner's issue: I'm not sure what format is required by the motifs module of Biopython. At the moment I have a fasta file which has sequences such as: ``` >nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnGGGAAACGG
updated 2.1 years ago • surka
Hi, I have a fasta file with about 500 protein sequences that have been compiled as the result of blast searches. Many of these sequences are...quite similar to each other. I would like to trim this fasta file such that only one copy of these highly similar sequences is left behind. My approach so far has been to use the pairwise...alignment tool in biopython, but this becomes very intractab…
updated 7.7 years ago • sudarshan1993
Hi all, I'm trying to incorporate a regular expression command in a biopython script. This produces an error: AttributeError: 'str' object has no attribute 'id' What I would like to do is to match...pattern within a Fasta file and replace the matching characters with other characters. From this: >BA_03462|gyrB Brenneria alni strain NCPPB...with open('outfile_padded.fasta')as f…
updated 4.7 years ago • bsp017
I am using python (3.6)/biopython(1.72) to read sequence files. I have an aligned sequence file in fasta format. >Human ----------------------------MRLRVRLLKRTWPLEVPETEPTL...MKLRVRLQKRTQPLEVPESEPTL-RAHLSQVLLPT-LPSSTDTEHSSLQD-NDQPSL I need to remove the gaps `'-'` from the file and have the result file like this: >Human MRLRVRLLKRTWPLEVPETEPTLRSHLRQSLLCTIPSSTDSEHSS…
updated 5.8 years ago • mdsiddra
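For the gap-removal question above, a minimal sketch in plain Python that strips `-` from sequence lines and leaves header lines untouched. The helper name `strip_gaps` and the short demo sequences are made up for illustration and are not the sequences from the post.

```python
def strip_gaps(lines):
    """Return FASTA lines with '-' removed from sequence lines; headers are kept as-is."""
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            cleaned.append(line)  # header: may legitimately contain '-'
        else:
            cleaned.append(line.replace("-", ""))
    return cleaned


# Made-up aligned records for demonstration only.
aligned = [">Human", "----MRLRV-RLLK", ">Mouse", "MKLRVRL--QKR"]
ungapped = strip_gaps(aligned)
print("\n".join(ungapped))
```

A sequence line that consists only of gaps becomes an empty string here, so such lines may need to be dropped before writing the output file.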
I have a .fasta file which is formatted like this: >NC_045512.2 |Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete...p/710/ Looks like Biopython can be used to solve it. Yet, I highly doubt it, since in the .fasta file, there is no sequence field, annotation, etc. to indicate which part of the input file is the sequence part and which...part of the input file i…
updated 3.3 years ago • 2001linana
function that is able to check the secondary structure information for each sequence present in a fasta file, using biopython. Can somebody help me
updated 22 months ago • filipa.simoes93
In biopython emboss applications, is it possible that the asequence and bsequence attributes of a needle/watercommandline object...be a variable instead of a fasta file? i.e. I want to write the sequence either directly or through a variable as shown below bseq= "aaatttccggtt" needle_cline
updated 5.2 years ago • Maria
Hi - I am using biopython (open to any other method however) to find sequences in a fasta file that match ID's from a .txt file. Just by looking at...the two files I can see there are matches, but the following script will not find them: #!/usr/bin/env python import sys from Bio import...i unique identifiers in %s" % (len(wanted), id_file) records = (r for r in SeqIO.parse(input_file, "fasta")…
updated 6.3 years ago • arronslacey
Hi all! Please help. I parsed sequences from GenBank, renamed them and saved them as a fasta file. >KP821216.1_Bluetongue v_Cameroon_Jan-1982 ATGGCTGCTCAGAATGAGCAACGTCCGGAGCGAATAAAAACGACACCGTATTTAGAGGGA...date_to = 1990 count = 0 for i, record in enumerate(SeqIO.parse("Bluetong_batch_cds.txt", "fasta")): a = record.description[-4:] if date_from <= int(a)…
updated 6.1 years ago • dmitri.ivanovsky
Hello, I am studying the application of blast in biopython. Now a problem is troubling me. I have to create a fasta file to use the function NcbiblastpCommandline (which is similar...A 'hard' way to solve the problem is: When the user submits his sequence, the server creates a file and executes the function NcbiblastpCommandline. Analogously, is it possible that it doesn't output as a file, j…
updated 10.4 years ago • Zealseeker
I'm learning biopython and I'm having trouble with reassigning the seq_record.id variable for my sequence. from Bio import SeqIO for seq_record...in SeqIO.parse(file, "fasta"): seq_record.id = name[0] #name[0] is the new ID print(seq_record.id)…
updated 8.0 years ago • QVINTVS_FABIVS_MAXIMVS
Hi, How to count amino acids from a text file? My file contains 1200 sequences. After getting amino acid composition, I have to create a bar plot. How to do these things...in Biopython? I parsed the sequence file into Biopython. I tried the following code to get amino acid composition. I have to get the composition for each sequence...x = ProteinAnalysis("cc.fasta") // cc.fasta is my sequence…
updated 12.5 years ago • Hpk
Hi, I am trying to extract sequences from a fasta file from a database with a specific organism species keyword from a .txt file containing the relevant headers. Do you...know how I can do this in python as the biopython guide I've looked at basically said "you're screwed if your files aren't .gb". The header looks like this: >VFG000361...returns different keywords as opposed to ju…
updated 3 months ago • joshuazvid
Over the past few days, I've tried many methods to extract subset of FASTA from a multi-FASTA file based on the header IDs. I've tried samtools, hpcgridrunner, biopython and various other fasta...efficiently. Using samtools, it took me days to retrieve a 2 million subset from a 55 million multi-fasta sequence file (8.4G in storage). Today, I finally found a solution to the problem. I wish to sha…
updated 10 months ago • hcwang
Hi, I've been dabbling in biopython for about a year and I recently upgraded to Biopython release 1.59. I've been refreshing my skills with some tutorials...taken verbatim from the Biopython Cookbook) but I always get the error below when I run a for loop and any module from the biopython library: IndentationError...Python/Tutorials/BioParse.py Traceback (most recent call last): File "/Users/…
updated 9.7 years ago • priyasshah
I am using biopython for converting my aligned sequence file from one format to another using `"AlignIO.write"` function. The thing I want...to know is that, 1. When I convert a sequence file given in phylip format to a file of clustal format, the resulted clustal file is of version `CLUSTAL X (1.81)` while I want an...in the form of CLUSTAL W / CLUSTAL 2.1 or so (higher version of clustal). …
updated 5.8 years ago • mdsiddra
to search the keyword “covid” in the nucleotide database and write the top 5 sequence records to a fasta file. I completed the code for question 1, but I am having trouble with saving the sequences in a fasta file. Any advice would
updated 2.5 years ago • Makayla
Hi, if I have a fasta file containing nucleotide sequences or protein sequences, is it possible to get the EC number using biopython, for example...so I need the fungal database from NCBI but I don't know how to download it or construct it from a fasta file using the makedb command line, and the second point is, since I have a large fasta query file to pass, I believe that I must
updated 4 months ago • friguiahlem8